{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "<center>\n",
    "<img src=\"https://habrastorage.org/files/fd4/502/43d/fd450243dd604b81b9713213a247aa20.jpg\">\n",
    "    \n",
    "## [mlcourse.ai](https://mlcourse.ai) - Open Machine Learning Course\n",
    "Auteur: [Yury Kashnitskiy](https://yorko.github.io) (@yorko). Edité et traduit par [Ousmane Cissé](https://github.com/oussou-dev). Ce matériel est soumis aux termes et conditions de la licence [Creative Commons CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). L'utilisation gratuite est autorisée à des fins non commerciales.</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "## <center> Mission 4. Détection des sarcasmes avec la régression logistique. Solution\n",
    "    \n",
    "Nous utiliserons le jeu de données de [paper](https://arxiv.org/abs/1704.05579) \"Un grand corpus auto-annoté pour le sarcasme\" avec plus de 1 million de commentaires provenant du site Reddit, étiquetés comme sarcastiques ou non. Une version complète peut être trouvée sur Kaggle sous la forme d'un [Jeu de données Kaggle](https://www.kaggle.com/danofer/sarcasm).</center>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "ed87ab2845921166bb73ca854bfe1ef013c035e9"
   },
   "outputs": [],
   "source": [
    "PATH_TO_DATA = \"../input/sarcasm/train-balanced-sarcasm.csv\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "ffa03aec57ab6150f9bec0fa56cd3a5791a3e6f4"
   },
   "outputs": [],
   "source": [
    "# some necessary imports\n",
    "import os\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "from matplotlib import pyplot as plt\n",
    "from sklearn.feature_extraction.text import TfidfVectorizer\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.metrics import accuracy_score, confusion_matrix\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.pipeline import Pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "b23e4fc7a1973d60e0c6da8bd60f3d921542a856"
   },
   "outputs": [],
   "source": [
    "train_df = pd.read_csv(PATH_TO_DATA)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "4dc7b3787afa46c7eb0d0e33b0c41ab9821c4a27"
   },
   "outputs": [],
   "source": [
    "train_df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "0a7ed9557943806c6813ad59c3d5ebdb403ffd78"
   },
   "outputs": [],
   "source": [
    "train_df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Certains commentaires sont manquants, alors nous supprimons les lignes correspondantes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "97b2d85627fcde52a506dbdd55d4d6e4c87d3f08"
   },
   "outputs": [],
   "source": [
    "train_df.dropna(subset=[\"comment\"], inplace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Nous remarquons que le jeu de données est bien équilibré"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "addd77c640423d30fd146c8d3a012d3c14481e11"
   },
   "outputs": [],
   "source": [
    "train_df[\"label\"].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Nous divisons les données en données d'entraînement et données de validation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "c200add4e1dcbaa75164bbcc73b9c12ecb863c96"
   },
   "outputs": [],
   "source": [
    "train_texts, valid_texts, y_train, y_valid = train_test_split(\n",
    "    train_df[\"comment\"], train_df[\"label\"], random_state=17\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "## Les tâches:\n",
    "1. Analysez le jeu de données, faites des graphiques. Ce [kernel](https://www.kaggle.com/sudalairajkumar/simple-exploration-notebook-qiqc) pourrait servir d'exemple.\n",
    "2. Créez un Tf-Idf + un pipeline de régression logistique pour prédire le sarcasme (`label`) en fonction du texte d'un commentaire sur Reddit (` comment`).\n",
    "3. Tracer les mots/bigrammes qui sont les plus prédictifs du sarcasme (vous pouvez utiliser [eli5](https://github.com/TeamHG-Memex/eli5) pour cela)\n",
    "4. (facultatif) ajouter des sous-titres (subreddits) en tant que nouvelles caractéristiques pour améliorer les performances du modèle. Appliquez ici l’approche Sac de mots (Bag of Words), c’est-à-dire traitez chaque sous-titre comme une nouvelle caractéristique."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "### Partie 1. Analyse exploratoire des données"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "La distribution des longueurs pour les commentaires sarcastiques et normaux est presque la même."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "dadaf341602993a7854867a1df3004d0aa5d9b8c"
   },
   "outputs": [],
   "source": [
    "train_df.loc[train_df[\"label\"] == 1, \"comment\"].str.len().apply(np.log1p).hist(\n",
    "    label=\"sarcastic\", alpha=0.5\n",
    ")\n",
    "train_df.loc[train_df[\"label\"] == 0, \"comment\"].str.len().apply(np.log1p).hist(\n",
    "    label=\"normal\", alpha=0.5\n",
    ")\n",
    "plt.legend();"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "c2c613ee2052a2c0379682adf5c23d1f751f4c3b"
   },
   "outputs": [],
   "source": [
    "from wordcloud import STOPWORDS, WordCloud"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "ae7333d67f6a17673d2aa16aed3017e2fbef9b58"
   },
   "outputs": [],
   "source": [
    "wordcloud = WordCloud(\n",
    "    background_color=\"black\",\n",
    "    stopwords=STOPWORDS,\n",
    "    max_words=200,\n",
    "    max_font_size=100,\n",
    "    random_state=17,\n",
    "    width=800,\n",
    "    height=400,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Le nuage de mots est beau, mais pas très utile"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "7bc095d9b0c549d4a21478ed52845210c0ffbb57"
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(16, 12))\n",
    "wordcloud.generate(str(train_df.loc[train_df[\"label\"] == 1, \"comment\"]))\n",
    "plt.imshow(wordcloud);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "f2423d8c8adf818c0f7c709665f741e1f885e47f"
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(16, 12))\n",
    "wordcloud.generate(str(train_df.loc[train_df[\"label\"] == 0, \"comment\"]))\n",
    "plt.imshow(wordcloud);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Analysons si certains sous-titres (subreddits) sont plus \"sarcastiques\" en moyenne que d'autres"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "6ba84720d54144e054b6963d78b48bf648e5c652"
   },
   "outputs": [],
   "source": [
    "sub_df = train_df.groupby(\"subreddit\")[\"label\"].agg([np.size, np.mean, np.sum])\n",
    "sub_df.sort_values(by=\"sum\", ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "575136be3a080cdd9a93ee7ee0c5fc9d9bf754e9"
   },
   "outputs": [],
   "source": [
    "sub_df[sub_df[\"size\"] > 1000].sort_values(by=\"mean\", ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "La même chose pour les auteurs ne donne pas beaucoup de perspicacité. Hormis le fait que les commentaires de quelqu'un aient été échantillonnés, nous pouvons constater le même nombre de commentaires sarcastiques et non sarcastiques."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "1cde1056ef0d3c62b98d70a299d83df29fcf6071"
   },
   "outputs": [],
   "source": [
    "sub_df = train_df.groupby(\"author\")[\"label\"].agg([np.size, np.mean, np.sum])\n",
    "sub_df[sub_df[\"size\"] > 300].sort_values(by=\"mean\", ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "46f4aa1c9524d525acec04ca0754f85a5190514b"
   },
   "outputs": [],
   "source": [
    "sub_df = (\n",
    "    train_df[train_df[\"score\"] >= 0]\n",
    "    .groupby(\"score\")[\"label\"]\n",
    "    .agg([np.size, np.mean, np.sum])\n",
    ")\n",
    "sub_df[sub_df[\"size\"] > 300].sort_values(by=\"mean\", ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "3e2f474c6219798c2fef2e4004ad4a6c1022f8ee"
   },
   "outputs": [],
   "source": [
    "sub_df = (\n",
    "    train_df[train_df[\"score\"] < 0]\n",
    "    .groupby(\"score\")[\"label\"]\n",
    "    .agg([np.size, np.mean, np.sum])\n",
    ")\n",
    "sub_df[sub_df[\"size\"] > 300].sort_values(by=\"mean\", ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "### Partie 2. Entraîner le modèle"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "3048a070a56b08eb4e5fe2c54b6d14905031e74a"
   },
   "outputs": [],
   "source": [
    "# build bigrams, put a limit on maximal number of features\n",
    "# and minimal word frequency\n",
    "tf_idf = TfidfVectorizer(ngram_range=(1, 2), max_features=50000, min_df=2)\n",
    "# multinomial logistic regression a.k.a softmax classifier\n",
    "logit = LogisticRegression(C=1, n_jobs=4, solver=\"lbfgs\", random_state=17, verbose=1)\n",
    "# sklearn's pipeline\n",
    "tfidf_logit_pipeline = Pipeline([(\"tf_idf\", tf_idf), (\"logit\", logit)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "8756bac7457218e4daf08ec276211f03971c17fb"
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "tfidf_logit_pipeline.fit(train_texts, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "d2e47f77f999c2fb5aee9ef1de1542bc93de4c98"
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "valid_pred = tfidf_logit_pipeline.predict(valid_texts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "a8f93efc3db12910eaa6d7944feebb2418714203"
   },
   "outputs": [],
   "source": [
    "accuracy_score(y_valid, valid_pred)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "### Partie 3. Expliquer le modèle"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "247a13fd3ae4d5c015c0ca0489a9a95d72ad7e9f"
   },
   "outputs": [],
   "source": [
    "def plot_confusion_matrix(\n",
    "    actual,\n",
    "    predicted,\n",
    "    classes,\n",
    "    normalize=False,\n",
    "    title=\"Confusion matrix\",\n",
    "    figsize=(7, 7),\n",
    "    cmap=plt.cm.Blues,\n",
    "    path_to_save_fig=None,\n",
    "):\n",
    "    \"\"\"\n",
    "    This function prints and plots the confusion matrix.\n",
    "    Normalization can be applied by setting `normalize=True`.\n",
    "    \"\"\"\n",
    "    import itertools\n",
    "\n",
    "    from sklearn.metrics import confusion_matrix\n",
    "\n",
    "    cm = confusion_matrix(actual, predicted).T\n",
    "    if normalize:\n",
    "        cm = cm.astype(\"float\") / cm.sum(axis=1)[:, np.newaxis]\n",
    "\n",
    "    plt.figure(figsize=figsize)\n",
    "    plt.imshow(cm, interpolation=\"nearest\", cmap=cmap)\n",
    "    plt.title(title)\n",
    "    plt.colorbar()\n",
    "    tick_marks = np.arange(len(classes))\n",
    "    plt.xticks(tick_marks, classes, rotation=90)\n",
    "    plt.yticks(tick_marks, classes)\n",
    "\n",
    "    fmt = \".2f\" if normalize else \"d\"\n",
    "    thresh = cm.max() / 2.0\n",
    "    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):\n",
    "        plt.text(\n",
    "            j,\n",
    "            i,\n",
    "            format(cm[i, j], fmt),\n",
    "            horizontalalignment=\"center\",\n",
    "            color=\"white\" if cm[i, j] > thresh else \"black\",\n",
    "        )\n",
    "\n",
    "    plt.tight_layout()\n",
    "    plt.ylabel(\"Predicted label\")\n",
    "    plt.xlabel(\"True label\")\n",
    "\n",
    "    if path_to_save_fig:\n",
    "        plt.savefig(path_to_save_fig, dpi=300, bbox_inches=\"tight\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "La matrice de confusion est assez équilibrée."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "6df0c058a45b48b756e57e01a23bbc0974407195"
   },
   "outputs": [],
   "source": [
    "plot_confusion_matrix(\n",
    "    y_valid,\n",
    "    valid_pred,\n",
    "    tfidf_logit_pipeline.named_steps[\"logit\"].classes_,\n",
    "    figsize=(8, 8),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "En effet, nous pouvons reconnaître certaines expressions indiquant un sarcasme. Comme \"oui bien sûr\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "f62f3043b6e94fb6bbd5683a0e9662c572847fa6"
   },
   "outputs": [],
   "source": [
    "import eli5\n",
    "\n",
    "eli5.show_weights(\n",
    "    estimator=tfidf_logit_pipeline.named_steps[\"logit\"],\n",
    "    vec=tfidf_logit_pipeline.named_steps[\"tf_idf\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "_uuid": "be94f2065f86c65d5ee5590c9b2e5a541135732c"
   },
   "source": [
    "So sarcasm detection is easy.\n",
    "<img src=\"https://habrastorage.org/webt/1f/0d/ta/1f0dtavsd14ncf17gbsy1cvoga4.jpeg\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "### Partie 4. Améliorer le modèle"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "aaefd2eb6f829f9eb9e9cd12c7903d3086182acc"
   },
   "outputs": [],
   "source": [
    "subreddits = train_df[\"subreddit\"]\n",
    "train_subreddits, valid_subreddits = train_test_split(subreddits, random_state=17)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Nous aurons des vectoriseurs Tf-Idf séparés pour les commentaires et les sous-titres (subreddits). Il est également possible de s'en tenir à un pipeline, mais dans ce cas, cela devient un peu moins simple. [Exemple](https://stackoverflow.com/questions/36731813/computing-separate-tfidf-scores-for-two-dwifferent-columns-using-sklearn)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "88be690ae260d824fbd8df4a4d02e5abcce0d5a7"
   },
   "outputs": [],
   "source": [
    "tf_idf_texts = TfidfVectorizer(ngram_range=(1, 2), max_features=50000, min_df=2)\n",
    "tf_idf_subreddits = TfidfVectorizer(ngram_range=(1, 1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Faites des transformations séparément pour les commentaires et les sous-titres."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "bfd35a513a3a090485df667e8ec773c57682aae7"
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "X_train_texts = tf_idf_texts.fit_transform(train_texts)\n",
    "X_valid_texts = tf_idf_texts.transform(valid_texts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "bea0b7a09c57afefb77e5eabd2b8ed18bf019855"
   },
   "outputs": [],
   "source": [
    "X_train_texts.shape, X_valid_texts.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "b81a4ad703e2fb0f97f21a449af7e0d8b39a860c"
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "X_train_subreddits = tf_idf_subreddits.fit_transform(train_subreddits)\n",
    "X_valid_subreddits = tf_idf_subreddits.transform(valid_subreddits)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "d55aec9753a506f41310b4b717cdbb8694af8a8e"
   },
   "outputs": [],
   "source": [
    "X_train_subreddits.shape, X_valid_subreddits.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Ensuite, empilez toutes les caractéristiques ensemble."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "bc458646e3c36792cf845f86cb9b6d6cc384b32c"
   },
   "outputs": [],
   "source": [
    "from scipy.sparse import hstack\n",
    "\n",
    "X_train = hstack([X_train_texts, X_train_subreddits])\n",
    "X_valid = hstack([X_valid_texts, X_valid_subreddits])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "d5d2505567c9303b2ce9f81c9bdc11fc799af91e"
   },
   "outputs": [],
   "source": [
    "X_train.shape, X_valid.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Entraînez de même la régression logistique."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "77b08444a449def113803365001d1f844620d5aa"
   },
   "outputs": [],
   "source": [
    "logit.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "1a48a2e637db7e695766af5be6f5aa60596f6ad9"
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "valid_pred = logit.predict(X_valid)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_uuid": "b50af0bc6ea1297538daaff183b7eecf8dfa7c2c"
   },
   "outputs": [],
   "source": [
    "accuracy_score(y_valid, valid_pred)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "Comme on peut le constater, la précision a légèrement augmenté."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lang": "fr"
   },
   "source": [
    "## Liens:\n",
    "  - Machine learning library [Scikit-learn](https://scikit-learn.org/stable/index.html) (a.k.a. sklearn)\n",
    "  - Kernels on [logistic regression](https://www.kaggle.com/kashnitsky/topic-4-linear-models-part-2-classification) and its applications to [text classification](https://www.kaggle.com/kashnitsky/topic-4-linear-models-part-4-more-of-logit), also a [Kernel](https://www.kaggle.com/kashnitsky/topic-6-feature-engineering-and-feature-selection) on feature engineering and feature selection\n",
    "  - [Kaggle Kernel](https://www.kaggle.com/abhishek/approaching-almost-any-nlp-problem-on-kaggle) \"Approaching (Almost) Any NLP Problem on Kaggle\"\n",
    "  - [ELI5](https://github.com/TeamHG-Memex/eli5) to explain model predictions"
   ]
  }
 ],
 "metadata": {
  "hide_input": false,
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.4"
  },
  "nbTranslate": {
   "displayLangs": [
    "*"
   ],
   "hotkey": "alt-t",
   "langInMainMenu": true,
   "sourceLang": "en",
   "targetLang": "fr",
   "useGoogleTranslate": true
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
